Here, we’re just setting a few options.
knitr::opts_chunk$set(
warning = TRUE, # show warnings during codebook generation
message = TRUE, # show messages during codebook generation
error = TRUE, # do not interrupt codebook generation in case of errors,
# usually better for debugging
echo = TRUE # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())
Now, we’re preparing our data for the codebook.
library(codebook)
codebook_data <- codebook::bfi
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
codebook_data <- rio::import("national_parks/data/park_biodiversity_data/parks.csv")
# omit the following lines, if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
only_labelled = TRUE, # only labelled values are autodetected as
# missing
negative_values_are_missing = FALSE, # if TRUE, negative values are treated as missing
ninety_nine_problems = TRUE, # 99/999 are missing values, if they
# are more than 5 MAD from the median
)
# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
codebook_data <- detect_scales(codebook_data)
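The naming convention described in the comments above can be sketched in base R. This is only an illustration, not part of the original analysis: the data frame `toy`, the item names `extra_1`, `extra_2R`, `extra_3R`, and the 1-to-5 response format are all assumptions. Because items are not reversed automatically, the sketch reverse-codes them by hand before averaging.

```r
# Hypothetical 1-5 Likert items; the "_raw" columns still need reversing
# (assumption: responses range from 1 to 5).
toy <- data.frame(
  extra_1     = c(5, 4, 2),
  extra_2_raw = c(1, 2, 5),
  extra_3_raw = c(2, 1, 4)
)

# Reverse-code manually: on a 1-5 scale, reversed = 6 - original.
# The "R" suffix marks the reversed items.
toy$extra_2R <- 6 - toy$extra_2_raw
toy$extra_3R <- 6 - toy$extra_3_raw
toy$extra_2_raw <- NULL
toy$extra_3_raw <- NULL

# The aggregate shares the items' stem ("extra"), matching the pattern
# scale = scale_1 + scale_2R + scale_3R.
toy$extra <- rowMeans(toy[, c("extra_1", "extra_2R", "extra_3R")])

toy
```

With columns named this way, passing `toy` to `detect_scales()` should let it recognise `extra` as an aggregate of the three items.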
Create codebook
codebook(codebook_data)
## No missing values.
Dataset name: codebook_data
The dataset has N=56 rows and 6 columns. 56 rows have no missing values on any column.
Variables
Distribution of values for Park Code
0 missing values.
| name | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace | label |
|---|---|---|---|---|---|---|---|---|---|
| Park Code | character | 0 | 1 | 56 | 0 | 4 | 4 | 0 | NA |
Distribution of values for Park Name
0 missing values.
| name | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace | label |
|---|---|---|---|---|---|---|---|---|---|
| Park Name | character | 0 | 1 | 56 | 0 | 18 | 46 | 0 | NA |
Distribution of values for State
0 missing values.
| name | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace | label |
|---|---|---|---|---|---|---|---|---|---|
| State | character | 0 | 1 | 27 | 0 | 2 | 10 | 0 | NA |
Distribution of values for Acres
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| Acres | numeric | 0 | 1 | 5550 | 238764 | 8323148 | 927929.1 | 1709258 | ▇▁▁▁▁ | NA |
Distribution of values for Latitude
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| Latitude | numeric | 0 | 1 | 19 | 39 | 68 | 41.23393 | 10.90883 | ▂▇▅▁▂ | NA |
Distribution of values for Longitude
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| Longitude | numeric | 0 | 1 | -159 | -111 | -68 | -113.2348 | 22.44029 | ▂▁▇▂▂ | NA |
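Every variable above reports `label` as NA, since a plain CSV file carries no variable labels. A minimal base-R sketch of attaching labels before calling `codebook()` follows. It assumes the package reads labels from a `label` attribute on each column, the convention used for SPSS and Stata imports. The data frame `parks` is a small stand-in for the real data, and the label texts are hypothetical.

```r
# Stand-in for two columns of the parks data (illustrative values only).
parks <- data.frame(
  Acres    = c(5550, 238764),
  Latitude = c(19, 68)
)

# Hypothetical label texts, keyed by column name.
labels <- c(
  Acres    = "Park area (acres)",
  Latitude = "Latitude of the park"
)

# Attach each text as a "label" attribute on the matching column.
for (col in names(labels)) {
  attr(parks[[col]], "label") <- labels[[col]]
}

attr(parks$Acres, "label")
```

The same loop works for column names that contain spaces, such as `Park Code`, because the columns are accessed with `[[` rather than `$`.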
The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
{
"name": "codebook_data",
"datePublished": "2021-04-29",
"description": "The dataset has N=56 rows and 6 columns.\n56 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name |label | n_missing|\n|:---------|:-----|---------:|\n|Park Code |NA | 0|\n|Park Name |NA | 0|\n|State |NA | 0|\n|Acres |NA | 0|\n|Latitude |NA | 0|\n|Longitude |NA | 0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
"keywords": ["Park Code", "Park Name", "State", "Acres", "Latitude", "Longitude"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "Park Code",
"@type": "propertyValue"
},
{
"name": "Park Name",
"@type": "propertyValue"
},
{
"name": "State",
"@type": "propertyValue"
},
{
"name": "Acres",
"@type": "propertyValue"
},
{
"name": "Latitude",
"@type": "propertyValue"
},
{
"name": "Longitude",
"@type": "propertyValue"
}
]
}